Attack of the Mutant Suffix Trees

Authors

  • N. Jesper Larsson
  • Arne Andersson
  • Kurt Swanson
  • Stefan Nilsson
  • Jan Åberg
Abstract

This is a thesis for the degree of filosofie licentiat (a Swedish degree between Master of Science and Ph.D.). It comprises three articles, all treating variations and augmentations of suffix trees, and the capability of the suffix tree data structure to efficiently capture similarities between different parts of a string. The presented applications are in the areas of data compression and pattern matching. The first article presents a method of using a suffix tree as an index into a sliding window, in optimal time and space, with applications in probabilistic data compression schemes, such as PPM, as well as in Ziv-Lempel compression. The second article presents a generalized suffix tree which is able to index a select subset of the suffixes of a string, with a construction algorithm which is optimal in time and space. The final article deals with the connection between the explicit context trees of PPM and the implicit mechanism of the Burrows-Wheeler transform for block sorting compression, again utilizing the suffix tree to capture the repetitive characteristics of a string. This document was revised on February 3, 1998. The differences from the presented thesis are in corrections of trivial errors only.
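
As a rough illustration of the block sorting idea mentioned in the abstract, the following minimal sketch computes a Burrows-Wheeler transform by sorting all rotations of the input and then inverts it. This is a naive textbook formulation, not the suffix tree based method of the thesis. The function names `bwt` and `inverse_bwt` and the `$` end marker are assumptions made for this example.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    Assumes `sentinel` does not occur in `text`; it marks the end of the
    input so that the transform can be inverted unambiguously.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Building every rotation costs quadratic space; fine for a small demo only.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    # The original string is the unique row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

In the output, characters that share a following context end up next to each other (the run `nn` and the run `aa` for "banana"), which is the repetitive structure that block sorting compressors exploit.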

Similar resources

Compact Suffix Trees Resemble PATRICIA Tries: Limiting Distribution of the Depth

Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we prove that the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even...

Faster Suffix Tree Construction with Missing

We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new backpropagation component to McCreight’s algorithm and also give a high probability hashing scheme for large degre...

Computing suffix links for suffix trees and arrays

We present a new and simple algorithm to reconstruct suffix links in suffix trees and suffix arrays. The algorithm is based on observations regarding suffix tree construction algorithms. With our algorithm we bring suffix arrays even closer to the ease of use and implementation of suffix trees.

Efficient Algorithms for Intrusion Detection

Detecting user-to-root attacks is an important intrusion detection task. This paper uses a mix of spectrum kernels and probabilistic suffix trees as a possible solution for detecting such intrusions efficiently. Experimental results on two real-world datasets show that the proposed approach outperforms the state-of-the-art Fisher kernel based methods in terms of speed with no loss of accuracy.

Constructing Chromosome Scale Suffix Trees

Suffix trees have been the focus of significant research interest as they permit very efficient solutions to a range of string and sequence searching problems. Given a suffix tree that encodes a particular string, it is possible to solve problems such as searching for a specific pattern in time proportional to the length of the pattern rather than the length of the string. Suffix trees can also...

Publication date: 1998